SAR imagery
SARVLM: A Vision Language Foundation Model for Semantic Understanding and Target Recognition in SAR Imagery
Ma, Qiwei, Wang, Zhiyu, Liu, Wang, Lu, Xukun, Deng, Bin, Duan, Puhong, Kang, Xudong, Li, Shutao
Synthetic Aperture Radar (SAR) is a crucial imaging modality thanks to its all-weather capability. Although recent advances in self-supervised learning and masked image modeling (MIM) have enabled SAR foundation models, these methods largely emphasize low-level visual features and often overlook multimodal alignment and zero-shot target recognition in SAR imagery. To address this, we construct SARVLM-1M, a large-scale vision-language dataset with over one million image-text pairs aggregated from existing datasets. We further propose a domain transfer training strategy to mitigate the large gap between natural and SAR imagery. Building on this, we develop SARVLM, the first vision language foundation model (VLM) tailored to SAR, comprising SARCLIP and SARCap. SARVLM is trained with a vision-language contrastive objective under the proposed domain transfer strategy, bridging SAR imagery and textual descriptions. Extensive experiments on image-text retrieval, zero-shot classification, semantic localization, and imagery captioning demonstrate that SARVLM delivers superior feature extraction and interpretation, outperforming state-of-the-art VLMs and advancing SAR semantic understanding. Code and datasets will be released soon.
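The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of a generic CLIP-style symmetric contrastive (InfoNCE) loss, not the SARVLM implementation. The function names, the toy embeddings, and the temperature of 0.07 are all assumptions made for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax(row):
    """Numerically stable log-softmax over one row of logits."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_sum for x in row]


def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE loss; image i and text i form the positive pair."""
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # logits[i][j] = cosine similarity of image i and text j, divided by temperature
    logits = [[sum(a * b for a, b in zip(imgs[i], txts[j])) / temperature
               for j in range(n)] for i in range(n)]
    # Image-to-text direction: each row should concentrate on its diagonal entry
    loss_i2t = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Text-to-image direction: same computation on the transposed matrix
    cols = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_t2i = -sum(log_softmax(cols[j])[j] for j in range(n)) / n
    return 0.5 * (loss_i2t + loss_t2i)


# Toy embeddings (hypothetical values, not taken from the paper)
images = [[1.0, 0.0], [0.0, 1.0]]
aligned = contrastive_loss(images, [[1.0, 0.1], [0.1, 1.0]])
shuffled = contrastive_loss(images, [[0.1, 1.0], [1.0, 0.1]])
print(aligned < shuffled)
```

Matching captions on the diagonal give a much lower loss than the same captions in swapped order, which is the property such an objective relies on to pull paired images and texts together.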
- Asia > China (0.04)
- North America > United States > Colorado (0.04)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Machine Learning (1.00)
- Information Technology > Artificial Intelligence > Vision > Image Understanding (0.89)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Latent space analysis and generalization to out-of-distribution data
Rainey, Katie, Hausmann, Erin, Waagen, Donald, Gray, David, Hulsey, Donald
Understanding the relationships between data points in the latent decision space derived by a deep learning system is critical to evaluating and interpreting the performance of the system on real-world data. Detecting out-of-distribution (OOD) data for deep learning systems continues to be an active research topic. We investigate the connection between latent space OOD detection and the classification accuracy of the model. Using open-source simulated and measured Synthetic Aperture Radar (SAR) datasets, we empirically demonstrate that OOD detection cannot be used as a proxy measure for model performance. We hope to inspire additional research into the geometric properties of the latent space that may yield future insights into deep learning robustness and generalizability.
Domain Adaptive SAR Wake Detection: Leveraging Similarity Filtering and Memory Guidance
Gao, He, Huang, Baoxiang, Radenkovic, Milena, Li, Borui, Chen, Ge
Synthetic Aperture Radar (SAR), with its all-weather and wide-area observation capabilities, serves as a crucial tool for wake detection. However, due to its complex imaging mechanism, wake features in SAR images often appear abstract and noisy, posing challenges for accurate annotation. In contrast, optical images provide more distinct visual cues, but models trained on optical data suffer from performance degradation when applied to SAR images due to domain shift. To address this cross-modal domain adaptation challenge, we propose a Similarity-Guided and Memory-Guided Domain Adaptation (termed SimMemDA) framework for unsupervised domain adaptive ship wake detection via instance-level feature similarity filtering and feature memory guidance. Specifically, to alleviate the visual discrepancy between optical and SAR images, we first utilize WakeGAN to perform style transfer on optical images, generating pseudo-images close to the SAR style. Then, an instance-level feature similarity filtering mechanism is designed to identify and prioritize source samples with target-like distributions, minimizing negative transfer. Meanwhile, a Feature-Confidence Memory Bank combined with a K-nearest neighbor confidence-weighted fusion strategy is introduced to dynamically calibrate pseudo-labels in the target domain, improving their reliability and stability. Finally, the framework further enhances generalization through region-mixed training, strategically combining source annotations with calibrated target pseudo-labels. Experimental results demonstrate that the proposed SimMemDA method can improve the accuracy and robustness of cross-modal ship wake detection, validating its effectiveness and feasibility.
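The abstract names a K-nearest neighbor confidence-weighted fusion over a feature-confidence memory bank but gives no formula. As a rough illustration of the general idea only, the sketch below blends a detection's own confidence with a similarity-weighted average of its nearest stored neighbors. The function `calibrate_confidence`, the mixing weight `alpha`, the use of cosine similarity, and all numbers are hypothetical and should not be read as the SimMemDA method.

```python
import math


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)


def calibrate_confidence(feature, confidence, memory_bank, k=3, alpha=0.5):
    """Blend a detection's confidence with that of its k nearest stored neighbors.

    memory_bank: list of (feature, confidence) pairs from earlier target samples.
    alpha: hypothetical mixing weight given to the detection's own confidence.
    """
    if not memory_bank:
        return confidence
    # Rank stored entries by similarity to the query feature and keep the top k
    scored = sorted(((cosine(feature, f), c) for f, c in memory_bank),
                    reverse=True)[:k]
    # Clamp negative similarities to zero so dissimilar neighbors get no vote
    weights = [max(s, 0.0) for s, _ in scored]
    total = sum(weights)
    if total == 0.0:
        return confidence
    neighbor_conf = sum(w * c for w, (_, c) in zip(weights, scored)) / total
    return alpha * confidence + (1.0 - alpha) * neighbor_conf


# Toy memory bank (hypothetical features and confidences)
bank = [([1.0, 0.0], 0.9), ([0.9, 0.1], 0.8), ([0.0, 1.0], 0.2)]
calibrated = calibrate_confidence([1.0, 0.05], 0.4, bank, k=2)
print(round(calibrated, 3))
```

In this toy case a low-confidence detection whose feature sits close to two confident stored samples is pulled upward, which is the kind of pseudo-label stabilization the abstract describes.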
- Europe > United Kingdom > England > Nottinghamshire > Nottingham (0.14)
- Asia > China > Shandong Province > Qingdao (0.05)
- Atlantic Ocean > North Atlantic Ocean > Baltic Sea (0.04)
- Europe > Iceland (0.04)
- North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
- North America > United States > Massachusetts > Middlesex County > Reading (0.04)
- Food & Agriculture > Fishing (1.00)
- Transportation (0.94)
- Government > Regional Government > North America Government > United States Government (0.93)
- Energy > Renewable (0.71)
Quantitative Comparison of Fine-Tuning Techniques for Pretrained Latent Diffusion Models in the Generation of Unseen SAR Images
Debuysère, Solène, Trouvé, Nicolas, Letheule, Nathan, Lévêque, Olivier, Colin, Elise
We present a framework for adapting a large pretrained latent diffusion model to high-resolution Synthetic Aperture Radar (SAR) image generation. The approach enables controllable synthesis and the creation of rare or out-of-distribution scenes beyond the training set. Rather than training a task-specific small model from scratch, we adapt an open-source text-to-image foundation model to the SAR modality, using its semantic prior to align prompts with SAR imaging physics (side-looking geometry, slant-range projection, and coherent speckle with heavy-tailed statistics). Using a 100k-image SAR dataset, we compare full fine-tuning and parameter-efficient Low-Rank Adaptation (LoRA) across the UNet diffusion backbone, the Variational Autoencoder (VAE), and the text encoders. Evaluation combines (i) statistical distances to real SAR amplitude distributions, (ii) textural similarity via Gray-Level Co-occurrence Matrix (GLCM) descriptors, and (iii) semantic alignment using a SAR-specialized CLIP model. Our results show that a hybrid strategy (full UNet tuning with LoRA on the text encoders and a learned token embedding) best preserves SAR geometry and texture while maintaining prompt fidelity. The framework supports text-based control and multimodal conditioning (e.g., segmentation maps, TerraSAR-X, or optical guidance), opening new paths for large-scale SAR scene data augmentation and unseen scenario simulation in Earth observation.
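For readers unfamiliar with LoRA, the sketch below shows the standard low-rank update it is built on: the frozen weight W is augmented as W + (alpha / r) · B · A, where only the small matrices A and B are trained. This is a generic illustration in plain Python, not the authors' training code; the helper names and toy matrices are assumptions.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def lora_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the rank (number of rows of A)."""
    r = len(a)
    delta = matmul(b, a)  # shape (d_out, d_in), rank at most r
    scale = alpha / r
    return [[w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
            for i in range(len(w))]


# Frozen 3x3 weight and a rank-1 update (toy values, not from the paper)
w = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
a = [[1.0, 2.0, 3.0]]          # r x d_in, trainable
b_zero = [[0.0], [0.0], [0.0]]  # d_out x r, trainable, zero-initialized
unchanged = lora_weight(w, a, b_zero, alpha=1.0)  # zero B leaves W as it was

b = [[0.5], [0.0], [0.0]]
adapted = lora_weight(w, a, b, alpha=2.0)

# Trainable parameters: r * d_in + d_out * r, versus d_out * d_in for full tuning
trainable = len(a) * len(a[0]) + len(b) * len(b[0])
print(adapted[0], trainable)
```

With B initialized to zero the adapted layer starts out identical to the pretrained one, and the trainable parameter count grows with the rank rather than with the full weight size, which is why LoRA is a natural baseline against full fine-tuning.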
- Europe > France (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Asia > Middle East > Israel > Southern District (0.04)
A Deep Learning framework for building damage assessment using VHR SAR and geospatial data: demonstration on the 2023 Turkiye Earthquake
Russo, Luigi, Tapete, Deodato, Ullo, Silvia Liberata, Gamba, Paolo
Building damage identification shortly after a disaster is crucial for guiding emergency response and recovery efforts. Although optical satellite imagery is commonly used for disaster mapping, its effectiveness is often hampered by cloud cover or the absence of pre-event acquisitions. To overcome these challenges, we introduce a novel multimodal deep learning (DL) framework for detecting building damage using single-date very high resolution (VHR) Synthetic Aperture Radar (SAR) imagery from the Italian Space Agency (ASI) COSMO-SkyMed (CSK) constellation, complemented by auxiliary geospatial data. Our method integrates SAR image patches, OpenStreetMap (OSM) building footprints, digital surface model (DSM) data, and structural and exposure attributes from the Global Earthquake Model (GEM) to improve detection accuracy and contextual interpretation. Unlike existing approaches that depend on pre- and post-event imagery, our model utilizes only post-event data, facilitating rapid deployment in critical scenarios. The framework's effectiveness is demonstrated using a new dataset from the 2023 earthquake in Turkey, covering multiple cities with diverse urban settings. Results highlight that incorporating geospatial features significantly enhances detection performance and generalizability to previously unseen areas. By combining SAR imagery with detailed vulnerability and exposure information, our approach provides reliable and rapid building damage assessments without depending on the availability of pre-event data. Moreover, the automated and scalable data generation process ensures the framework's applicability across diverse disaster-affected regions, underscoring its potential to support effective disaster management and recovery efforts. Code and data will be made available upon acceptance of the paper.
- Asia > Middle East > Republic of Türkiye > Kahramanmaras Province > Kahramanmaras (0.06)
- Asia > Middle East > Republic of Türkiye > Osmaniye Province > Osmaniye (0.05)
- Asia > Middle East > Syria (0.05)
- Government > Space Agency (0.69)
- Government > Regional Government (0.48)
- Materials > Construction Materials (0.46)
Benchmarking Suite for Synthetic Aperture Radar Imagery Anomaly Detection (SARIAD) Algorithms
Chauvin, Lucian, Gupta, Somil, Ibarra, Angelina, Peeples, Joshua
Anomaly detection is a key research challenge in computer vision and machine learning with applications in many fields from quality control to radar imaging. In radar imaging, specifically synthetic aperture radar (SAR), anomaly detection can be used for the classification, detection, and segmentation of objects of interest. However, there is no method for developing and benchmarking these methods on SAR imagery. To address this issue, we introduce SAR imagery anomaly detection (SARIAD). In conjunction with Anomalib, a deep-learning library for anomaly detection, SARIAD provides a comprehensive suite of algorithms and datasets for assessing and developing anomaly detection approaches on SAR imagery. SARIAD specifically integrates multiple SAR datasets along with tools to effectively apply various anomaly detection algorithms to SAR imagery. Several anomaly detection metrics and visualizations are available. Overall, SARIAD acts as a central package for benchmarking SAR models and datasets to allow for reproducible research in the field of anomaly detection in SAR imagery. This package is publicly available: https://github.com/Advanced-Vision-and-Learning-Lab/SARIAD.
- North America > United States > Texas > Brazos County > College Station (0.15)
- Asia > Middle East > Syria (0.04)
- Asia > Middle East > Republic of Türkiye (0.04)
- Asia > India > Andhra Pradesh (0.04)
- Energy (0.96)
- Government > Regional Government > North America Government > United States Government (0.46)
BRIGHT: A globally distributed multimodal building damage assessment dataset with very-high-resolution for all-weather disaster response
Chen, Hongruixuan, Song, Jian, Dietrich, Olivier, Broni-Bediako, Clifford, Xuan, Weihao, Wang, Junjue, Shao, Xinlei, Wei, Yimin, Xia, Junshi, Lan, Cuiling, Schindler, Konrad, Yokoya, Naoto
Disaster events occur around the world and cause significant damage to human life and property. Earth observation (EO) data enables rapid and comprehensive building damage assessment (BDA), an essential capability in the aftermath of a disaster to reduce human casualties and to inform disaster relief efforts. Recent research focuses on the development of AI models to achieve accurate mapping of unseen disaster events, mostly using optical EO data. However, solutions based on optical data are limited to clear skies and daylight hours, preventing a prompt response to disasters. Integrating multimodal (MM) EO data, particularly the combination of optical and SAR imagery, makes it possible to provide all-weather, day-and-night disaster responses. Despite this potential, the development of robust multimodal AI models has been constrained by the lack of suitable benchmark datasets. In this paper, we present a BDA dataset using veRy-hIGH-resoluTion optical and SAR imagery (BRIGHT) to support AI-based all-weather disaster response. To the best of our knowledge, BRIGHT is the first open-access, globally distributed, event-diverse MM dataset specifically curated to support AI-based disaster response. It covers five types of natural disasters and two types of man-made disasters across 12 regions worldwide, with a particular focus on developing countries where external assistance is most needed. The optical and SAR imagery in BRIGHT, with a spatial resolution between 0.3-1 meters, provides detailed representations of individual buildings, making it ideal for precise BDA. In our experiments, we have tested seven advanced AI models trained with our BRIGHT to validate the transferability and robustness. The dataset and code are available at https://github.com/ChenHongruixuan/BRIGHT. BRIGHT also serves as the official dataset for the 2025 IEEE GRSS Data Fusion Contest.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Africa > Middle East > Libya > Derna District > Derna (0.05)
A physics-guided neural network for flooding area detection using SAR imagery and local river gauge observations
Gierszewska, Monika, Berezowski, Tomasz
The flooding extent area in a river valley is related to river gauge observations. The higher the water elevation, the larger the flooding area. Because synthetic aperture radar (SAR) can penetrate clouds, radar images have been commonly used to estimate flooding extent area with various methods, from simple thresholding to deep learning models. In this study, we propose a physics-guided neural network for flooding area detection. Our approach takes as input Sentinel-1 time-series images and the river water elevation assigned to each image. We apply the Pearson correlation coefficient between the predicted sum of water extent areas and the local river water level observations as the loss function. The effectiveness of our method is evaluated in five different study areas by comparing the predicted water maps with reference water maps obtained from digital terrain models and optical satellite images. The highest Intersection over Union (IoU) score achieved by our models was 0.89 for the water class and 0.96 for the non-water class. Additionally, we compared the results with other unsupervised methods. The proposed neural network provided a higher IoU than the other methods, especially for SAR images registered during low water elevation in the river.
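A correlation-based loss of this kind can be sketched as follows. The abstract does not give the exact form, so writing the loss as 1 − r (so that perfect positive correlation gives zero loss) is an assumption, as are the function names and the toy maps. The IoU helper mirrors the evaluation metric mentioned in the abstract.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_loss(water_maps, gauge_levels):
    """1 - r between per-image predicted water area and gauge water level.

    water_maps: one 2D map of water probabilities per image in the time series.
    The 1 - r form is an assumed convention, not taken from the paper.
    """
    areas = [sum(sum(row) for row in m) for m in water_maps]
    return 1.0 - pearson(areas, gauge_levels)


def iou(pred, ref):
    """Intersection over Union of two binary masks; assumes a non-empty union."""
    pairs = [(p, r) for pr, rr in zip(pred, ref) for p, r in zip(pr, rr)]
    inter = sum(1 for p, r in pairs if p and r)
    union = sum(1 for p, r in pairs if p or r)
    return inter / union


# Toy time series: predicted water area grows with the gauge level
maps = [[[1, 0], [0, 0]], [[1, 1], [0, 0]], [[1, 1], [1, 0]]]
levels = [1.2, 2.0, 2.8]  # hypothetical gauge readings in metres
good = correlation_loss(maps, levels)
bad = correlation_loss(maps[::-1], levels)  # areas shrink as the level rises
score = iou([[1, 1], [0, 0]], [[1, 0], [0, 0]])
print(round(good, 6), round(bad, 6), score)
```

The loss needs no pixel-level labels: it only rewards predictions whose total water area rises and falls with the gauge readings, which is what makes the approach physics-guided rather than supervised.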
- South America > Brazil (0.05)
- Europe > Italy (0.05)
- Europe > United Kingdom > England (0.05)
Combining Indigenous knowledge and AI to support safer on-ice travel
Warming temperatures mean shorter ice seasons in Sanikiluaq, Nunavut. As a result, the stretches of landfast ice formed from frozen seawater that Inuit use to travel and hunt on are increasingly unpredictable and unsafe. Polynyas, areas of open water and thin ice, occur where ocean currents or wind prevent pack ice from forming. They're typically found in the same locations each year, enabling travellers to plan their routes safely. But climate change is affecting this predictability, causing smaller, unexpected polynyas that make travelling across the pack ice risky.
- North America > Canada > Nunavut (0.25)
- Atlantic Ocean > North Atlantic Ocean > Hudson Bay (0.05)